📝 Longest Common Subsequence (Plagiarism Detection)

Help Emma detect similarities in student documents by finding the LCS

👩‍💻 Emma's Plagiarism Detection Challenge

🎯 The Mission:

Emma needs to find all unique Longest Common Subsequences (LCS) between two document strings and their length to detect similarities.

📋 Requirements:

  • Compute all unique LCS strings between two documents
  • Calculate the length of the LCS (including spaces)
  • Handle strings of length 0 to 1000
  • Output unique LCS strings followed by length

Input/Output Specifications

  • Input: Two strings, doc1 and doc2 (0 ≤ length ≤ 1000)
  • Output: All unique LCS strings, followed by "The length of the Longest Common Subsequence is: "

Example: doc1 = "This is test", doc2 = "is test This"

doc1:

T
h
i
s
i
s
t
e
s
t

doc2:

i
s
t
e
s
t
T
h
i
s

LCS: "is test" (Length: 7)

⚡ LCS Explained

How LCS Works

  1. Dynamic Programming: Build a DP table to find the length of LCS
  2. Backtrack: Trace back to find all unique LCS strings
  3. Handle Spaces: Include spaces in the sequence and length calculation

DP Table Example (This is test, is test This)

is test This
0000000000000
T0000000001111
h0000000001222
i0111111111233
s0122222222234
0123333333334
t0123444444444
e0123455555555
s0123456666666
t0123456777777

LCS: "is test" (Length: 7)

Time Complexity

O(mn)

For building DP table

Space Complexity

O(mn)

For DP table and results

Why LCS for Plagiarism?

  • ✅ Detects common phrases in texts
  • ✅ Handles spaces and punctuation
  • ✅ Finds all possible LCS strings
  • ❌ May have multiple LCS solutions

🔍 Step-by-Step LCS Demo

Click "Start Demo" to begin step-by-step visualization

Algorithm Progress:

1. Display input documents
2. Build DP table
3. Backtrack for all LCS
4. Display results

Current Documents:

doc1:

doc2:

DP Table:

🎮 Practice LCS

Enter documents and click "Find LCS"

Test Cases

Example 1: doc1 = "This is test", doc2 = "is test This" → LCS: "is test", Length: 7

Example 2: doc1 = "The brown quick fox", doc2 = "The quick brown fox" → LCS: "The brown fox", "The quick fox", Length: 13

📊 Algorithm Analysis

LCS Process

  1. DP Table: Build a table to compute LCS length
  2. Backtrack: Find all unique LCS strings
  3. Handle Duplicates: Store only unique LCS sequences

Time Complexity

O(mn)

For building DP table

Space Complexity

O(mn)

For DP table and results

Key Points

  • Dynamic Programming: Efficiently computes LCS length
  • Backtracking: Finds all possible LCS strings
  • Uniqueness: Stores only unique sequences
  • Applications: Plagiarism detection, text comparison
  • Limitation: Multiple LCS strings possible